Finite Query Languages for Sequence Databases

Authors

  • Giansalvatore Mecca
  • Anthony J. Bonner
Abstract

This paper develops a query language for sequence databases, such as genome databases and text databases. Unlike relational data, queries over sequential data can easily produce infinite answer sets, since the universe of sequences is infinite, even for a finite alphabet. The challenge is to develop query languages that are both highly expressive and finite. This paper develops such a language. It is a subset of a recently developed logic called Sequence Datalog [19]. Sequence Datalog distinguishes syntactically between subsequence extraction and sequence construction. Extraction creates sequences of bounded length and leads to safe recursion, while construction can create sequences of arbitrary length and leads to unsafe recursion. In this paper, we develop syntactic restrictions for Sequence Datalog that allow sequence construction but preserve finiteness. The main idea is to use safe recursion to control and limit unsafe recursion. The main results are the definition of a finite form of recursion, called domain bounded recursion, and a characterization of its complexity and expressive power. Although finite, the resulting class of programs is highly expressive, since its data complexity is complete for the elementary functions.
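
The contrast between extraction and construction can be illustrated with a small sketch. This is not Sequence Datalog syntax and not the paper's method; it is a plain Python illustration under the assumption that "extraction" means taking contiguous subsequences and "construction" means concatenation. The function names (`subsequences`, `extraction_closure`, `construction_rounds`) are hypothetical. Repeated extraction reaches a fixpoint because no result is longer than an input, whereas repeated concatenation keeps producing longer sequences, so the sketch caps it at a fixed number of rounds.

```python
def subsequences(s):
    """All contiguous subsequences of s, including the empty sequence."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}


def extraction_closure(db):
    """Apply extraction until no new sequence appears.

    This always terminates: every extracted sequence is no longer than
    the longest sequence already in the database.
    """
    seen = set(db)
    while True:
        new = set().union(*(subsequences(s) for s in seen)) - seen
        if not new:
            return seen
        seen |= new


def construction_rounds(db, rounds):
    """Apply concatenation for a fixed number of rounds.

    Returns the length of the longest sequence after each round.
    Without the cap on rounds, this process would never reach a fixpoint
    for a database containing a non-empty sequence.
    """
    seen = set(db)
    longest = []
    for _ in range(rounds):
        seen |= {a + b for a in seen for b in seen}
        longest.append(max(len(s) for s in seen))
    return longest


closed = extraction_closure({"acgt"})
growth = construction_rounds({"ac"}, 3)
```

Here `closed` stays within the length of the input sequence, while `growth` records a longest length that doubles on every round, which is the kind of unbounded behaviour that the syntactic restrictions described in the abstract are meant to rule out.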

Similar resources

Selecting the Most Suitable Query Language for Using Outer Joins to Extract Data in Datalog Mode in the Deductive Database System DES

Deductive database systems are designed around a logical data model. In a deductive database system, data are stored as facts, whereas a relational database management system (RDBMS) stores data in tables. Datalog Educational System (DES) is a deductive database system whose default mode is Datalog mode. It can extract data using outer joins with three query la...

Formal Languages and Algorithms for Similarity Based Retrieval from Sequence Databases

The paper considers various formalisms based on automata, temporal logic and regular expressions for specifying queries over finite sequences. Unlike traditional semantics that associate a true or false value denoting whether a sequence satisfies a query, the paper presents distance measures that associate a value in the interval [0, 1] with a sequence and a query, denoting how closely the seque...

Sequence Datalog: Declarative String Manipulation in Databases

We investigate logic-based query languages for sequence databases, that is, databases in which strings of symbols over a fixed alphabet can occur. We discuss different approaches to querying strings, including Prolog and Datalog with function symbols, and argue that all of them have important limitations. We then present the semantics of Sequence Datalog, a logic for querying sequence databases, ...

On the finite controllability of conjunctive query answering in databases under open-world assumption

In this paper we study queries over relational databases with integrity constraints (ICs). The main problem we analyze is OWA query answering, i.e., query answering over a database with ICs under open-world assumption. The kinds of ICs that we consider are inclusion dependencies and functional dependencies, in particular key dependencies; the query languages we consider are conjunctive queries ...

Query Languages for Sequence Databases: Termination and Complexity

This paper develops a query language for sequence databases, such as genome databases and text databases. Unlike relational data, queries over sequential data can easily produce infinite answer sets, since the universe of sequences is infinite, even for a finite alphabet. The challenge is to develop query languages that are both highly expressive and finite. This paper develops such a language as a s...

Publication date: 1995